Exploiting the ASR n-best by tracking multiple dialog state hypotheses
Abstract
When the top ASR hypothesis is incorrect, the correct hypothesis is often listed as an alternative in the ASR N-Best list. Whereas traditional spoken dialog systems have struggled to exploit this information, this paper argues that a dialog model that tracks a distribution over multiple dialog states can improve dialog accuracy by making use of the entire N-Best list. The key element of the approach is a generative model of the N-Best list given the user’s true hidden action. An evaluation on real dialog data verifies that dialog accuracy rates are improved by making use of the entire N-Best list.
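To make the idea concrete, the sketch below shows one simple way a belief over hidden user actions could be updated from a whole N-Best list instead of only its top entry. It is an illustration under stated assumptions, not the paper's model: the function name `update_belief`, the observation model (a listed action's likelihood is its confidence score, and the leftover probability mass is shared uniformly among unlisted actions), and all numbers are hypothetical.

```python
from typing import Dict, List, Tuple


def update_belief(
    prior: Dict[str, float],
    nbest: List[Tuple[str, float]],
) -> Dict[str, float]:
    """Return a posterior over hidden user actions given one N-Best list.

    prior: probability of each candidate user action before the observation.
    nbest: (hypothesis, confidence) pairs; confidences are assumed to sum
           to at most 1 (an assumption of this sketch).
    """
    listed = dict(nbest)
    # Mass not assigned to any listed hypothesis (assumed observation model).
    leftover = max(0.0, 1.0 - sum(listed.values()))
    unlisted = [action for action in prior if action not in listed]
    per_unlisted = leftover / len(unlisted) if unlisted else 0.0

    # Bayes rule: posterior is proportional to prior times likelihood.
    unnormalized = {
        action: p * listed.get(action, per_unlisted)
        for action, p in prior.items()
    }
    total = sum(unnormalized.values())
    if total == 0.0:
        # The observation carries no usable evidence; keep the prior.
        return dict(prior)
    return {action: v / total for action, v in unnormalized.items()}


# Hypothetical example: the ASR ranks "boston" first, but the dialog
# history makes "austin" more likely a priori.
prior = {"boston": 0.2, "austin": 0.6, "houston": 0.2}
nbest = [("boston", 0.5), ("austin", 0.3)]
posterior = update_belief(prior, nbest)
best_action = max(posterior, key=posterior.get)
print(best_action, posterior)
```

In this made-up example a system that trusts only the 1-best would commit to "boston", while the belief update still favors "austin" because the second-ranked hypothesis agrees with the prior. A real system would need an observation model estimated from data.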
Similar resources
Beyond ASR 1-best: Using word confusion networks in spoken language understanding
We are interested in the problem of robust understanding from noisy spontaneous speech input. With the advances in automated speech recognition (ASR), there has been increasing interest in spoken language understanding (SLU). A challenge in large vocabulary spoken language understanding is robustness to ASR errors. State-of-the-art spoken language understanding relies on the best ASR hypotheses...
Hypotheses ranking and state tracking for a multi-domain dialog system using multiple ASR alternates
In this paper, we present an approach to improve the accuracy of a multi-domain, multi-turn spoken dialog system (SDS) by including alternate results from automatic speech recognition (ASR). Often, even if the top-ranked result from the ASR is not correct, the correct result may still be available in the N-best list or in the word confusion network (WCN). Thus, the SDS performance can be improved b...
The Dialog State Tracking Challenge
In a spoken dialog system, dialog state tracking deduces information about the user’s goal as the dialog progresses, synthesizing evidence such as dialog acts over multiple turns with external data sources. Recent approaches have been shown to overcome ASR and SLU errors in some applications. However, there are currently no common testbeds or evaluation measures for this task, hampering progres...
User Goal Change Model for Spoken Dialog State Tracking
In this paper, a Maximum Entropy Markov Model (MEMM) for dialog state tracking is proposed to efficiently handle changes in the user goal in two steps. The system first predicts, for each dialog turn, whether a user goal change occurs, based on linguistic features and dialog context; the proposed model then uses this user goal change information to infer the most probable dialog state ...
Semantic parsing using word confusion networks with conditional random fields
A challenge in large vocabulary spoken language understanding (SLU) is robustness to automatic speech recognition (ASR) errors. State-of-the-art approaches for semantic parsing rely on discriminative sequence classification methods, such as conditional random fields (CRFs). Most dialog systems employ a cascaded approach where the best hypotheses from the ASR system are fed into the fo...